Learning r-of-k Functions by Boosting
Authors
Abstract
We investigate further improvement of boosting in the case that the target concept belongs to the class of r-of-k threshold Boolean functions, which answer “+1” if at least r of the k relevant variables are positive, and “−1” otherwise. Given m examples of an r-of-k function and literals as base hypotheses, popular boosting algorithms (e.g., AdaBoost) construct a consistent final hypothesis using O(k² log m) base hypotheses. While this convergence speed is tight in general, we show that a modification of AdaBoost (confidence-rated AdaBoost [SS99] or InfoBoost [Asl00]) can exploit the property of r-of-k functions that base literals err mostly on one side of their predictions, and thereby find a consistent final hypothesis using only O(kr log m) base hypotheses, an improvement since r ≤ k. Our result extends the previous investigation by Hatano and Warmuth [HW04] and gives more general examples where confidence-rated AdaBoost or InfoBoost has an advantage over AdaBoost.
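To make the definitions concrete, here is a minimal Python sketch, not taken from the paper: it evaluates an r-of-k function, verifies the one-sided-error property of a relevant literal for the special case r = 1 (an OR), and computes the two per-branch confidence values that confidence-rated AdaBoost assigns to a binary base hypothesis. The instance size, the choice of relevant variables, and the smoothing constant are illustrative assumptions.

```python
import itertools
import math

def r_of_k(x, relevant, r):
    """Return +1 if at least r of the relevant coordinates of x are +1, else -1."""
    return +1 if sum(x[i] == +1 for i in relevant) >= r else -1

# A tiny instance: a 1-of-2 function (an OR) over n = 4 variables.
n, relevant, r = 4, [0, 2], 1
X = list(itertools.product([-1, +1], repeat=n))
y = [r_of_k(x, relevant, r) for x in X]

def h(x):
    return x[0]  # a relevant literal used as a base hypothesis

# One-sided error: for r = 1, whenever h predicts +1 the target is also +1,
# so every mistake h makes falls on its -1 predictions.
assert all(label == +1 for x, label in zip(X, y) if h(x) == +1)

# Per-branch confidences in the style of confidence-rated AdaBoost [SS99]
# (InfoBoost [Asl00] likewise uses two coefficients per base hypothesis):
# alpha[b] = (1/2) ln(W[b,+1] / W[b,-1]), where W[b,c] is the total weight of
# examples with h(x) = b and label c. Here W[+1,-1] = 0, so h's +1 predictions
# can be trusted almost completely; a small smoothing term keeps alpha finite.
w = [1.0 / len(X)] * len(X)          # uniform example weights for illustration
W = {(b, c): 0.0 for b in (-1, +1) for c in (-1, +1)}
for wi, x, label in zip(w, X, y):
    W[(h(x), label)] += wi
eps = 1e-9                           # hypothetical smoothing constant
alpha = {b: 0.5 * math.log((W[(b, +1)] + eps) / (W[(b, -1)] + eps))
         for b in (-1, +1)}
print(alpha)  # alpha[+1] is large, alpha[-1] near 0 for this instance
```

Plain AdaBoost assigns a single coefficient to each base hypothesis, so it cannot exploit this asymmetry; the two-coefficient variant above is what lets the modified boosters trust the error-free side of a literal more heavily.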
Similar resources
A Hybrid Framework for Building an Efficient Incremental Intrusion Detection System
In this paper, a boosting-based incremental hybrid intrusion detection system is introduced. This system combines incremental misuse detection and incremental anomaly detection. We use a boosting ensemble of weak classifiers to implement the misuse intrusion detection system. For incremental misuse detection, it can identify new types of intrusions that do not exist in the training dataset. As...
Machine Learning Models for Housing Prices Forecasting using Registration Data
This article aims to identify the best model for housing price forecasting using machine learning methods, with maximum accuracy and minimum error. Five important machine learning algorithms are used to predict housing prices: K-Nearest Neighbor Regression (KNNR), Support Vector Regression (SVR), Random Forest Regression (RFR), Extreme Gradient B...
Attribute-efficient learning of decision lists and linear threshold functions under unconcentrated distributions
We consider the well-studied problem of learning decision lists using few examples when many irrelevant features are present. We show that smooth boosting algorithms such as MadaBoost can efficiently learn decision lists of length k over n Boolean variables using poly(k, log n) many examples, provided that the marginal distribution over the relevant variables is “not too concentrated” in an L2-no...
Online Gradient Boosting
We extend the theory of boosting for regression problems to the online learning setting. Generalizing from the batch setting for boosting, the notion of a weak learning algorithm is modeled as an online learning algorithm with linear loss functions that competes with a base class of regression functions, while a strong learning algorithm is an online learning algorithm with smooth convex loss f...
Boosting and Hard-Core Sets
This paper connects two fundamental ideas from theoretical computer science: hard-core set construction, a type of hardness amplification from computational complexity, and boosting, a technique from computational learning theory. Using this connection we give fruitful applications of complexity-theoretic techniques to learning theory and vice versa. We show that the hard-core set construction ...